A Framework for Bus Trajectory Extraction and Missing Data Recovery for Data Sampled from the Internet

Authors

  • Changfei Tong
  • Huiling Chen
  • Qi Xuan
  • Xuhua Yang
Abstract

This paper presents a novel framework for trajectory extraction and missing data recovery for bus travel data sampled from the Internet. The trajectory extraction procedure consists of three main parts: trajectory clustering, trajectory cleaning, and trajectory connecting. In the clustering procedure, we focus on feature construction and parameter selection for the fuzzy C-means clustering method. Following clustering, the trajectory cleaning algorithm is implemented based on a newly introduced fuzzy connecting matrix, which evaluates the possibility that data points belong to the same trajectory and helps detect anomalies in a ranked, context-related order. Finally, the trajectory connecting algorithm is proposed to handle cases in which a route trajectory is incorrectly partitioned into several clusters. In the missing data recovery procedure, we develop contextual linear interpolation for missing data occurring inside the trajectory and median value interpolation for missing data outside the trajectory. Extensive experiments demonstrate that the proposed framework can effectively extract and recover bus trajectories sampled from the Internet.
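The two recovery cases named in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of the paper's contextual linear interpolation or median value interpolation, so the code below makes its own assumptions: the samples are evenly spaced values of a single quantity (for example, one coordinate of a bus position), gaps inside the observed span are filled by plain linear interpolation between the nearest observed neighbors, and gaps before the first or after the last observation are filled with the median of all observed values. The function name `recover_missing` is hypothetical.

```python
from statistics import median
from typing import List, Optional


def recover_missing(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill None entries in an evenly sampled series (illustrative only).

    Assumption: interior gaps use plain linear interpolation and
    leading/trailing gaps use the median of the observed samples.
    This is a stand-in, not the paper's exact method.
    """
    observed = [i for i, v in enumerate(values) if v is not None]
    if not observed:
        return list(values)  # nothing observed, so nothing to anchor a fill on

    first, last = observed[0], observed[-1]
    fallback = median(values[i] for i in observed)

    filled = list(values)

    # Gaps outside the observed span: median fill.
    for i in range(len(values)):
        if i < first or i > last:
            filled[i] = fallback

    # Gaps inside the observed span: interpolate between neighboring observations.
    for left, right in zip(observed, observed[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            filled[i] = values[left] + weight * (values[right] - values[left])

    return filled


series = [None, 10.0, None, 20.0, 60.0, None]
print(recover_missing(series))  # [20.0, 10.0, 15.0, 20.0, 60.0, 20.0]
```

In this example the interior gap at index 2 is filled halfway between its neighbors (15.0), while the leading and trailing gaps take the median of the observed values (20.0). The median is less sensitive than the mean to the outlying sample 60.0.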


Similar resources

A statistical analysis framework for bus reliability evaluation based on AVL data: A case study of Qazvin, Iran

Reliability is a fundamental factor in the operation of bus transportation systems because it is a direct indicator of the quality of service and the operator's costs. Today, the application of GPS technology in bus systems makes large amounts of data available, though it also brings the difficulty of preprocessing the data methodically. In this study, the principal component an...


Investigating the missing data effect on credit scoring rule based models: The case of an Iranian bank

Credit risk management is a process in which banks estimate the probability of default (PD) for each loan applicant. Data sets of previous loan applicants are built by gathering their data; these internal data sets are usually completed using external credit bureau data and finally used by banks to estimate PD. There is also continuing interest among banks in using rule-based classifiers to b...


Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the rapid growth of spatial trajectory data and the need to process it and extract useful information and meaningful patterns have attracted many researchers to the field of spatio-temporal trajectory clustering. The processing and analysis of these trajectories have resulted in the extraction of useful information whic...


Lane Change Trajectory Model Considering the Driver Effects Based on MANFIS

The lane change maneuver is among the most common driving behaviors. It is also the basic element of important maneuvers such as overtaking. Therefore, it is chosen as the focus of this study, and novel multi-input multi-output adaptive neuro-fuzzy inference system (MANFIS) models are proposed for this behavior. These models are able to simulate and predict the future behavior of a Dri...


Application of the Response Surface Methodology for the Optimization of the Aqueous Enzymatic Extraction of Pistacia Khinjuk Oil

ABSTRACT: Aqueous enzymatic extraction of oil from Pistacia khinjuk was performed using cellulase. A central composite design was used to optimize the parameters that are significant to the process. The influence of three regressors on the percentage of oil recovered from the seed was evaluated using a second-order polynomial multiple regression model. Analysis of variance showed a high coefficient ...



Journal title:

Volume 17, Issue

Pages -

Publication date: 2017